Multi-Lingual Phrase-Based Statistical Machine Translation for Arabic-English

Authors

  • Ahmed Bastawisy
  • Mohamed Elmahdy
Abstract

In this paper, we implement a multilingual Statistical Machine Translation (SMT) system for Arabic-English translation. Arabic text can be categorized into Modern Standard Arabic (MSA) and dialectal Arabic, and these two forms differ significantly. We compare different monolingual and multilingual hybrid SMT approaches. Monolingual systems consistently achieve better translation accuracy on one Arabic form and poor accuracy on the other. Multilingual SMT models trained on pooled parallel MSA/dialectal data achieve better accuracy. However, because the available parallel MSA data are much larger than the dialectal data, these multilingual models are biased toward MSA. In this work, we propose a multilingual combination of different monolingual systems using an Arabic form classifier. The outcome of the classifier directs the system to use the appropriate monolingual models (standard, dialectal, or mixture). Testing the different SMT systems shows that the proposed classifier-based SMT system outperforms both the monolingual and the data-pooled multilingual systems.
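The classifier-based routing described in the abstract can be pictured with a minimal Python sketch. Everything named here is an assumption made for illustration: the marker-word list, the `classify_arabic_form` and `make_router` functions, the 0.5 threshold, and the stand-in translation models are hypothetical and are not taken from the paper, which does not specify its classifier in the abstract.

```python
from typing import Callable, Dict

Translator = Callable[[str], str]

# Hypothetical marker words (common Egyptian Arabic forms), for illustration only.
DIALECT_MARKERS = {"مش", "عايز", "ازاي", "دلوقتي"}


def classify_arabic_form(sentence: str, threshold: float = 0.5) -> str:
    """Toy stand-in for an Arabic form classifier.

    Returns "standard", "dialectal", or "mixture" based on the share of
    tokens that appear in the (assumed) dialect marker list.
    """
    tokens = sentence.split()
    if not tokens:
        # Nothing to inspect: fall back to the mixture model.
        return "mixture"
    hits = sum(1 for token in tokens if token in DIALECT_MARKERS)
    if hits == 0:
        return "standard"
    if hits / len(tokens) >= threshold:
        return "dialectal"
    return "mixture"


def make_router(
    models: Dict[str, Translator],
    classify: Callable[[str], str] = classify_arabic_form,
) -> Translator:
    """Build a translator that sends each sentence to the model chosen by the classifier."""

    def translate(sentence: str) -> str:
        label = classify(sentence)
        return models[label](sentence)

    return translate


# Placeholder models that only tag their input; real SMT decoders would go here.
models: Dict[str, Translator] = {
    "standard": lambda s: f"[standard model] {s}",
    "dialectal": lambda s: f"[dialectal model] {s}",
    "mixture": lambda s: f"[mixture model] {s}",
}

translate = make_router(models)
```

Keeping the classifier and the models behind plain callables means a trained classifier or a different set of monolingual models could be swapped in without changing the routing logic.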

Similar resources

A Web-based Demonstrator of a Multi-lingual Phrase-based Translation System

This paper describes a multi-lingual phrase-based Statistical Machine Translation system accessible by means of a Web page. The user can issue translation requests from Arabic, Chinese or Spanish into English. The same phrase-based statistical technology is employed to realize the three supported language-pairs. New language-pairs can be easily added to the demonstrator. The Web-based interface...

Syntactic Phrase Reordering for English-to-Arabic Statistical Machine Translation

Syntactic Reordering of the source language to better match the phrase structure of the target language has been shown to improve the performance of phrase-based Statistical Machine Translation. This paper applies syntactic reordering to English-to-Arabic translation. It introduces reordering rules, and motivates them linguistically. It also studies the effect of combining reordering with Arabi...

Tuning a phrase-based statistical translation system for the IWSLT 2005 Chinese to English and Arabic to English tasks

Nowadays, most of the statistical translation systems are based on phrases (i.e. groups of words). We describe a phrase-based system using a modified method for the phrase extraction which deals with larger phrases while keeping a reasonable number of phrases. Also, different alignments to extract phrases are allowed and additional features are used which lead to a clear improvement in the perf...

Improving Arabic-Chinese Statistical Machine Translation using English as Pivot Language

We present a comparison of two approaches for Arabic-Chinese machine translation using English as a pivot language: sentence pivoting and phrase-table pivoting. Our results show that using English as a pivot in either approach outperforms direct translation from Arabic to Chinese. Our best result is the phrase-pivot system which scores higher than direct translation by 1.1 BLEU points. An error...

Machine Translation System on the Pair of Arabic / English

Our work is part of the project "TELA", a computer-assisted environment for learning the Arabic language, which covers many issues related to the use of words in Arabic. This environment contains several subsystems whose purpose is to provide an important educational function by allowing the learner to discover information beyond the scope of the phrase of the year. In these subsystem...

Publication date: 2017